pacman::p_load(sf, tmap, tidyverse, dplyr, readr, plotly, seriation, dendextend, heatmaply)take home exercise 4
Purpose:
we want to find out the present rent price in Singapore Market.
Data preparation:
The data we download from SG HDB website.
Retentialdata <- read_csv("data/ResidentialRental/20241.csv")we change the variable more easy to read.
Retentialdata2024 <- Retentialdata %>%
select(
PostalDistrict = `Postal District`,
NoOfBedroom = `No of Bedroom`,
MonthlyRent = `Monthly Rent ($)`,
StreetName = `Street Name`,
ProjectName1 = `Project Name`)postdistrictdata <- read_csv("data/ResidentialRental/realis2018.csv") %>%
select(postalDistrict2 = "Postal District",
PlanningArea = "Planning Area",
PlanningRegion = "Planning Region",
ProjectName2 = "Project Name") %>%
distinct()Data Merge:
We use left_join to merge the data, because we want to add “PlanningArea”.
Retentialdata2024merged <- Retentialdata2024 %>%
left_join(postdistrictdata, by = c("ProjectName1" = "ProjectName2")) %>%
drop_na() %>%
mutate(PlanningArea = case_when(
PlanningArea == "YISHUN" ~ "Yishun",
TRUE ~ PlanningArea
)) We want to select the NoOfBedroom 1-4.
filtered_data <- Retentialdata2024merged %>%
filter(NoOfBedroom %in% 1:4) %>%
group_by(PlanningArea, NoOfBedroom) %>%
summarise(Median_Rent = median(MonthlyRent, na.rm = TRUE)) %>%
ungroup() %>%
pivot_wider(names_from = NoOfBedroom, values_from = Median_Rent)rent2024 <- filtered_data %>%
pivot_longer(-PlanningArea, names_to = "NoOfBedroom", values_to = "Median_Rent")rent2024room1 <- rent2024 %>% filter(NoOfBedroom == 1)The LLNEW downloads from Tableau, then we change the format that R can read.
LLNEW <- read_csv("data/ResidentialRental/LLNEW.csv")%>%
select(PlanningArea1 = "Planning Area",
Latitude = "Latitude (generated)",
Longitude = "Longitude (generated)")rent2024_joined <- rent2024 %>%
filter(NoOfBedroom %in% c("1", "2", "3", "4")) %>%
left_join(LLNEW, by = c("PlanningArea" = "PlanningArea1"))rent_wide <- rent2024_joined %>%
pivot_wider(
id_cols = c("PlanningArea", "Latitude", "Longitude"),
names_from = NoOfBedroom,
values_from = Median_Rent
)write.csv(rent_wide, "rent2024_wide_format.csv", row.names = FALSE)rent2024room1_joined <- rent2024room1 %>%
left_join(LLNEW, by = c("PlanningArea" = "PlanningArea1"))MAP:
We use the sf package to convert the data to geospatial data.
rent2024room1_joined_sf <- st_as_sf(rent2024room1_joined,
coords = c("Longitude",
"Latitude"),
crs = 4326) %>%
st_transform(crs = 3414)We use the mpsz dataset in the mapping function to visualize geographic data.
mpsz <- st_read(dsn = "data/geospatial",
layer = "MP14_SUBZONE_WEB_PL") %>%
st_transform(crs = 3414)Reading layer `MP14_SUBZONE_WEB_PL' from data source
`/Users/linxu/JANXULIN/ISSS608/take home exercise 4/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
We divide rooms into 1, 2, 3, and 4 categories, and plot the median values on the SG map, marking the points of planning areas with longitude and latitude (this part of the data comes from importing data from Tableau). Then, we merge the data using left_join and map it onto the map. (In R, we can allow customers to select 1 room, 2 rooms, 3 rooms, or 4 rooms themselves.)
p1<-tmap_options(check.and.fix = TRUE)
tmap_mode("view")
tm_shape(mpsz) +
tm_borders() +
tm_shape(rent2024room1_joined_sf) +
tm_dots(col= "Median_Rent") tmap_mode("plot")At the same time, we create a heatmap to display the median price of each area, allowing customers to better understand the overall area rental market prices in Singapore.
p2<-ggplot(rent2024, aes(x = PlanningArea, y = NoOfBedroom, fill = Median_Rent)) +
geom_tile() +
labs(x = "Planning Area", y = "Number of Bedrooms", fill = "Median Monthly Rent") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
plotly_p <- ggplotly(p2, tooltip = "fill")
plotly_p